Integrating Co-occurrence Statistics with Information Extraction for Robust Retrieval of Protein Interactions from Medline
Abstract
The task of mining relations from collections of documents is usually approached in two different ways. One type of system extracts relations from individual sentences and then aggregates the results over the entire collection. Other systems take an entirely different approach, using co-occurrence counts to determine whether two entities are mentioned together more often than chance alone would explain. We show that extraction performance improves when the two approaches are combined into an integrated relation extraction model.
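As a rough illustration of the co-occurrence idea only, the sketch below scores entity pairs with pointwise mutual information (PMI) computed from sentence-level counts. The abstract does not say which statistic the authors use, so PMI is an assumption here, and the function name `pmi_scores` and its input format (one set of entity names per sentence) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences):
    """Score entity pairs by PMI over sentence-level co-occurrence.

    `sentences` is a list of sets, each holding the entity names
    mentioned in one sentence (an assumed, simplified input format).
    A score above 0 means the pair co-occurs more often than
    independence would predict; below 0 means less often.
    """
    n = len(sentences)
    single = Counter()  # number of sentences mentioning each entity
    pair = Counter()    # number of sentences mentioning both entities
    for entities in sentences:
        entities = set(entities)
        single.update(entities)
        pair.update(combinations(sorted(entities), 2))

    scores = {}
    for (a, b), count in pair.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), with probabilities
        # estimated as sentence counts divided by n.
        scores[(a, b)] = math.log2((count * n) / (single[a] * single[b]))
    return scores
```

Pairs that never co-occur get no score at all in this sketch. A combined model of the kind the abstract describes would also use the output of a sentence-level extractor, which is not shown here.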
Similar resources
Biobibliometrics: Information Retrieval and Visualization from Co-occurrences of Gene Names in Medline
Successful information retrieval from biomedical literature databases is becoming increasingly difficult. We have developed a prototype system for retrieving and visualizing information from literature and genomic databases using gene names. The premise of our work is that, if two genes have a related biological function, the co-occurrence of two gene names (or aliases of those genes) within th...
Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in Medline abstracts.
Successful information retrieval from biomedical literature databases is becoming increasingly difficult. We have developed a prototype system for retrieving and visualizing information from literature and genomic databases using gene names. The premise of our work is that, if two genes have a related biological function, the co-occurrence of two gene names (or aliases of those genes) within th...
Image retrieval using the combination of text-based and content-based algorithms
Image retrieval is an important research field that has received great attention in recent decades. In this paper, we present an approach to image retrieval based on the combination of text-based and content-based features. Keywords serve as the text-based features, and color and texture serve as the content-based features. A query in this system contains some keywords and an input...
Mining Association Rules from Unstructured Documents
This paper presents a system called EART (Extract Association Rules from Text) for discovering association rules from collections of unstructured documents. The EART system handles text only, not images or figures. EART discovers association rules amongst keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transf...
Mining MEDLINE: Abstracts, Sentences, or Phrases?
A growing body of work addresses automated mining of biochemical knowledge from digital repositories of scientific literature, such as MEDLINE. Some of these works use abstracts as the unit of text from which to extract facts. Others use sentences for this purpose, while still others use phrases. Here we compare abstracts, sentences, and phrases in MEDLINE using the standard information retrieva...